Implement `reverse` performance optimization #4775

ahkcs · 2025-11-10T23:00:13Z

Description

Originally from #4056 by @selsong

This PR optimizes the reverse command by eliminating the expensive ROW_NUMBER() window function and replacing it with three-tier logic based on query context.

Motivation

The previous implementation used the ROW_NUMBER() window function, which:

Required materializing the entire dataset
Caused excessive memory usage
Failed on large datasets (100M+ records) with "insufficient resources" errors

Solution: Three-Tier Reverse Logic

The reverse command now follows context-aware behavior:

With existing sort/collation: Reverses all sort directions (ASC ↔ DESC)
With @timestamp field (no explicit sort): Sorts by @timestamp in descending order
Without sort or @timestamp: The command is ignored (no-op)
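The three tiers above can be read as a simple decision function. The following is a minimal sketch of that decision only, not the actual CalciteRelNodeVisitor.visitReverse() code: the class ReverseTierSketch, the Action enum, and chooseAction are hypothetical names, and the query context is reduced to two plain lists (the existing sort keys and the field names of the input).

```java
import java.util.List;

/** Hypothetical sketch of the three-tier decision; not the PR's actual implementation. */
final class ReverseTierSketch {

  /** What reverse does for a given query context. */
  enum Action { FLIP_EXISTING_SORT, SORT_BY_TIMESTAMP_DESC, IGNORE }

  static final String TIMESTAMP_FIELD = "@timestamp";

  /**
   * @param sortKeys sort keys already present upstream of reverse (empty if none)
   * @param fieldNames field names available in the input
   */
  static Action chooseAction(List<String> sortKeys, List<String> fieldNames) {
    if (!sortKeys.isEmpty()) {
      return Action.FLIP_EXISTING_SORT; // tier 1: an existing sort is reversed
    }
    if (fieldNames.contains(TIMESTAMP_FIELD)) {
      return Action.SORT_BY_TIMESTAMP_DESC; // tier 2: fall back to @timestamp DESC
    }
    return Action.IGNORE; // tier 3: nothing meaningful to reverse
  }

  public static void main(String[] args) {
    // prints FLIP_EXISTING_SORT
    System.out.println(chooseAction(List.of("balance", "firstname"), List.of("balance", "firstname")));
    // prints SORT_BY_TIMESTAMP_DESC
    System.out.println(chooseAction(List.of(), List.of("@timestamp", "category", "value")));
    // prints IGNORE
    System.out.println(chooseAction(List.of(), List.of("account_number", "firstname")));
  }
}
```

Checking the existing sort first means an explicit user sort always takes precedence over the implicit @timestamp ordering.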

Implementation Details

1. Reverse with Explicit Sort (Primary Use Case)

Query:

source=accounts | sort +balance, -firstname | reverse

Behavior: Flips all sort directions: +balance, -firstname → -balance, +firstname

Logical Plan:

LogicalSystemLimit(sort0=[$3], sort1=[$1], dir0=[DESC-nulls-last], dir1=[ASC-nulls-first], fetch=[10000], type=[QUERY_SIZE_LIMIT])
  LogicalProject(account_number=[$0], firstname=[$1], ...)
    LogicalSort(sort0=[$3], sort1=[$1], dir0=[DESC-nulls-last], dir1=[ASC-nulls-first])
      CalciteLogicalIndexScan(table=[[OpenSearch, accounts]])

Physical Plan: (efficiently pushes reversed sort to OpenSearch)

CalciteEnumerableIndexScan(table=[[OpenSearch, accounts]],
  PushDownContext=[[..., SORT->[
    {"balance": {"order": "desc", "missing": "_last"}},
    {"firstname.keyword": {"order": "asc", "missing": "_first"}}
  ], LIMIT->10000]])
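The direction flip in this case can be illustrated with a small standalone model. This is a sketch under assumptions, not the PR's PlanUtils.reverseCollation() method and not the Calcite API: SortKey, Direction, and Nulls are hypothetical types, and the only behavior taken from the plans above is that ASC-nulls-first becomes DESC-nulls-last and vice versa.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical model of collation reversal; not Calcite's RelCollation types. */
final class ReverseCollationSketch {

  enum Direction { ASC, DESC }

  enum Nulls { FIRST, LAST }

  /** One sort key: field name, sort direction, and null placement. */
  static final class SortKey {
    final String field;
    final Direction direction;
    final Nulls nulls;

    SortKey(String field, Direction direction, Nulls nulls) {
      this.field = field;
      this.direction = direction;
      this.nulls = nulls;
    }

    /** Flips both the direction and the null placement. */
    SortKey reversed() {
      Direction d = direction == Direction.ASC ? Direction.DESC : Direction.ASC;
      Nulls n = nulls == Nulls.FIRST ? Nulls.LAST : Nulls.FIRST;
      return new SortKey(field, d, n);
    }

    @Override
    public String toString() {
      return field + " " + direction + "-nulls-" + nulls.name().toLowerCase(Locale.ROOT);
    }
  }

  /** Reverses every key while keeping the key order (primary key stays primary). */
  static List<SortKey> reverse(List<SortKey> collation) {
    List<SortKey> out = new ArrayList<>();
    for (SortKey key : collation) {
      out.add(key.reversed());
    }
    return out;
  }

  public static void main(String[] args) {
    List<SortKey> original = List.of(
        new SortKey("balance", Direction.ASC, Nulls.FIRST),
        new SortKey("firstname", Direction.DESC, Nulls.LAST));
    // prints [balance DESC-nulls-last, firstname ASC-nulls-first]
    System.out.println(reverse(original));
    // prints [balance ASC-nulls-first, firstname DESC-nulls-last]
    System.out.println(reverse(reverse(original)));
  }
}
```

Flipping the null placement together with the direction is what makes the reversed output an exact mirror of the original ordering, and it is also why applying the flip twice returns the original collation.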

2. Reverse with @timestamp (Time-Series Optimization)

Query:

source=time_series_logs | reverse | head 100

Behavior: When no explicit sort exists but the index has an @timestamp field, reverse automatically sorts by @timestamp DESC to show most recent events first.

Use Case: Common pattern in log analysis - users want recent logs first

Logical Plan:

LogicalSystemLimit(sort0=[$0], dir0=[DESC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
  LogicalProject(@timestamp=[$0], category=[$1], value=[$2])
    LogicalSort(sort0=[$0], dir0=[DESC])
      CalciteLogicalIndexScan(table=[[OpenSearch, time_data]])

3. Reverse Ignored (No-Op Case)

Query:

source=accounts | reverse | head 100

Behavior: When there's no explicit sort AND no @timestamp field, reverse is ignored. Results appear in natural index order.

Rationale: Avoid expensive operations when reverse has no meaningful semantic interpretation.

Logical Plan:

LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT])
  LogicalProject(account_number=[$0], firstname=[$1], ...)
    CalciteLogicalIndexScan(table=[[OpenSearch, accounts]])

Note: No sort node is added - reverse is completely ignored.

4. Double Reverse (Cancellation)

Query:

source=accounts | sort +balance, -firstname | reverse | reverse

Behavior: Two reverses cancel each other out, returning to original sort order.

Logical Plan:

LogicalSystemLimit(sort0=[$3], sort1=[$1], dir0=[ASC-nulls-first], dir1=[DESC-nulls-last], fetch=[10000])
  LogicalProject(account_number=[$0], firstname=[$1], ...)
    LogicalSort(sort0=[$3], sort1=[$1], dir0=[ASC-nulls-first], dir1=[DESC-nulls-last])
      CalciteLogicalIndexScan(table=[[OpenSearch, accounts]])

Final sort order matches original query: +balance, -firstname

5. Multiple Sorts + Reverse

Query:

source=accounts | sort +balance | sort -firstname | reverse

Behavior: Reverse applies to the most recent sort (under PPL semantics, the last sort wins).

Logical Plan:

LogicalSystemLimit(sort0=[$1], dir0=[ASC-nulls-first], fetch=[10000])
  LogicalProject(account_number=[$0], firstname=[$1], ...)
    LogicalSort(sort0=[$1], dir0=[ASC-nulls-first])
      CalciteLogicalIndexScan(table=[[OpenSearch, accounts]])

Result: Only firstname sort is reversed (DESC → ASC). The balance sort is overridden by PPL's "last sort wins" rule.
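The "last sort wins" interaction can be traced with a toy pipeline model. This is only an illustrative sketch: LastSortWinsSketch and its methods are hypothetical, commands are plain strings, every sort key is assumed to carry an explicit + or - prefix, and the @timestamp fallback for an unsorted input is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of how sort and reverse commands combine. */
final class LastSortWinsSketch {

  /** Returns the effective sort keys after applying the commands in order. */
  static List<String> effectiveSort(List<String> commands) {
    List<String> current = new ArrayList<>();
    for (String command : commands) {
      if (command.equals("reverse")) {
        // reverse flips whatever sort is currently in effect
        List<String> flipped = new ArrayList<>();
        for (String key : current) {
          flipped.add(flip(key));
        }
        current = flipped;
      } else if (command.startsWith("sort ")) {
        // a later sort replaces the earlier one entirely
        current = new ArrayList<>();
        for (String key : command.substring(5).split(",")) {
          current.add(key.trim());
        }
      }
    }
    return current;
  }

  /** Swaps the leading '+' (ascending) and '-' (descending) sign of a key. */
  static String flip(String key) {
    char sign = key.charAt(0) == '+' ? '-' : '+';
    return sign + key.substring(1);
  }

  public static void main(String[] args) {
    // prints [+firstname]
    System.out.println(effectiveSort(List.of("sort +balance", "sort -firstname", "reverse")));
    // prints [+balance, -firstname]
    System.out.println(effectiveSort(List.of("sort +balance, -firstname", "reverse", "reverse")));
  }
}
```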

Related Issues

Resolves #3924

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

dai-chen

QQ: I recall the major comment on the original PR was about early optimization in the analyzer layer. Is this new PR trying to address that concern? Ref: #4056 (comment)

ahkcs · 2025-11-11T22:03:04Z

> QQ: I recall the major comment on the original PR was about early optimization in the analyzer layer. Is this new PR trying to address that concern? Ref: #4056 (comment)

Hi Chen, I think that's a valid concern. However, after trying it out, I found that moving the optimization adds significant complexity compared to the current approach. CalciteRelNodeVisitor acts as a logical plan builder that constructs the logical representation of the query, so I think this optimization can happen there as well. In our approach, visitReverse chooses between LogicalSort(reversed) and LogicalSort(ROW_NUMBER), which seems appropriate for a logical plan builder. If we moved the optimization to a Calcite rule, we would have to start with a naive representation (always ROW_NUMBER) and then rewrite it, which is more complex.

noCharger

Can you add before vs. after benchmark results?

LantaoJin · 2025-11-12T15:12:33Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+      // Fallback: use ROW_NUMBER approach when no existing sort
+      RexNode rowNumber =


Just an idea: how about ignoring the reverse command if there are no existing collations and no IMPLICIT_FIELD_TIMESTAMP in rowType?
If there is an IMPLICIT_FIELD_TIMESTAMP in rowType, there is no need to add ROW_NUMBER.

Thanks for the suggestion! I think it makes sense, since the reverse command may not have meaningful semantics if there's no ordering.

Updated the implementation: the reverse command is now ignored if no collation or @timestamp field is found.
Updated the test cases and documentation as well.

@selsong

This commit optimizes the `reverse` command in the Calcite planner by intelligently reversing existing sort collations instead of always using the ROW_NUMBER() approach.

Key changes:
- Added PlanUtils.reverseCollation() method to flip sort directions and null directions
- Updated CalciteRelNodeVisitor.visitReverse() to:
  - Check for existing sort collations
  - Reverse them if present (more efficient)
  - Fall back to ROW_NUMBER() when no sort exists
- Added comprehensive integration test expected outputs for:
  - Single field reverse pushdown
  - Multiple field reverse pushdown
  - Reverse fallback cases
  - Double reverse no-op optimizations

This optimization significantly improves performance when reversing already-sorted data by leveraging database-native sort reversal.

Based on PR opensearch-project#4056 by @selsong

Signed-off-by: Kai Huang <[email protected]>

Signed-off-by: Kai Huang <[email protected]>

ahkcs · 2025-11-12T23:17:59Z

> Can you add before vs. after benchmark results?

The test was run against the big5_v2_first100m dataset (the first 100M docs of big5).

Before: the query failed to execute, even with head 10.

Query: source=big5_v2_first100m | sort +`@timestamp` | reverse | head <size>

  Running single_sort_reverse_10 (5 iterations)... ERROR: {
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: exception while executing query: insufficient resources to run the query, quit.",
    "type": "RuntimeException"
  },
  "status": 500
}

After:

=== Test 1: Single Field Sort + Reverse ===
Query: source=big5_v2_first100m | sort +`@timestamp` | reverse | head <size>

  Running single_sort_reverse_10 (5 iterations)... Done
  Running single_sort_reverse_100 (5 iterations)... Done
  Running single_sort_reverse_1000 (5 iterations)... Done

Size         Avg (ms)   P50 (ms)   P90 (ms)   Min (ms)   Max (ms)
---------- ---------- ---------- ---------- ---------- ----------
10                496        495        502        490        502
100               501        498        516        495        516
1K                577        576        586        571        586

=== Test 2: Single Field Sort DESC + Reverse ===
Query: source=big5_v2_first100m | sort -`@timestamp` | reverse | head <size>

  Running single_sort_desc_reverse_10 (5 iterations)... Done
  Running single_sort_desc_reverse_100 (5 iterations)... Done
  Running single_sort_desc_reverse_1000 (5 iterations)... Done

Size         Avg (ms)   P50 (ms)   P90 (ms)   Min (ms)   Max (ms)
---------- ---------- ---------- ---------- ---------- ----------
10                422        423        424        419        424
100               436        438        439        434        439
1K                515        507        548        503        548

=== Test 3: Multi-field Sort + Reverse ===
Query: source=big5_v2_first100m | sort +`host.name`, -`@timestamp` | reverse | head <size>

  Running multi_sort_reverse_10 (5 iterations)... Done
  Running multi_sort_reverse_100 (5 iterations)... Done
  Running multi_sort_reverse_1000 (5 iterations)... Done

Size         Avg (ms)   P50 (ms)   P90 (ms)   Min (ms)   Max (ms)
---------- ---------- ---------- ---------- ---------- ----------
10                610        587        699        582        699
100               601        599        612        590        612
1K                665        666        667        661        667

=== Test 4: Double Reverse (should cancel out) ===
Query: source=big5_v2_first100m | sort +`@timestamp` | reverse | reverse | head <size>

  Running double_reverse_10 (5 iterations)... Done
  Running double_reverse_100 (5 iterations)... Done
  Running double_reverse_1000 (5 iterations)... Done

Size         Avg (ms)   P50 (ms)   P90 (ms)   Min (ms)   Max (ms)
---------- ---------- ---------- ---------- ---------- ----------
10                441        423        514        420        514
100               436        437        445        427        445
1K                506        502        521        502        521

=== Test 5: Sort + Reverse with Filter ===
Query: source=big5_v2_first100m | where `host.name` != 'unknown' | sort +`@timestamp` | reverse | head <size>

  Running filter_sort_reverse_10 (5 iterations)... Done
  Running filter_sort_reverse_100 (5 iterations)... Done
  Running filter_sort_reverse_1000 (5 iterations)... Done

Size         Avg (ms)   P50 (ms)   P90 (ms)   Min (ms)   Max (ms)
---------- ---------- ---------- ---------- ---------- ----------
10                532        511        612        507        612
100               518        520        532        507        532
1K                589        587        598        585        598

=== REVERSE COMMAND PERFORMANCE TEST COMPLETE ===

LantaoJin · 2025-11-13T03:11:03Z

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteReverseCommandIT.java

+  }
+
+  @Test
+  public void testReverseWithTimestampField() throws IOException {


Can you add some ITs for streamstats? streamstats always sorts on __stream_seq__ to keep the output ordering the same as the input's. We'd better add some ITs to verify the pipeline streamstats | reverse.

Added ITs for streamstats. Since the reverse command now only works when there's a detectable collation, using streamstats alone causes reverse to be ignored. Do we want to keep it this way?

No, streamstats contains the collation __stream_seq__ ASC.

Signed-off-by: Kai Huang <[email protected]>

ahkcs marked this pull request as ready for review November 10, 2025 23:05

ahkcs mentioned this pull request Nov 10, 2025

Implement reverse performance optimization #4056

ahkcs force-pushed the feat/reverse_optimization branch from 4483045 to 0246535 Compare November 10, 2025 23:22

dai-chen reviewed Nov 11, 2025

ahkcs requested a review from dai-chen November 11, 2025 22:29

noCharger reviewed Nov 12, 2025

LantaoJin added the enhancement New feature or request label Nov 12, 2025

LantaoJin reviewed Nov 12, 2025

ahkcs added 8 commits November 12, 2025 12:07

fixes

3dbb977

Signed-off-by: Kai Huang <[email protected]>

fix UT

474e188

Signed-off-by: Kai Huang <[email protected]>

fix IT

24c5674

Signed-off-by: Kai Huang <[email protected]>

update ExplainIT

80b8b1d

Signed-off-by: Kai Huang <[email protected]>

Add double reverse explain

70e4a6d

Signed-off-by: Kai Huang <[email protected]>

fix ExplainIT

fd86fee

Signed-off-by: Kai Huang <[email protected]>

udpated implementation: ignore reverse if no collation/@timestamp found

39aecf6

Signed-off-by: Kai Huang <[email protected]>

ahkcs force-pushed the feat/reverse_optimization branch from b16218b to 39aecf6 Compare November 12, 2025 21:57

Update doc

5ea90dd

Signed-off-by: Kai Huang <[email protected]>

LantaoJin reviewed Nov 13, 2025

add IT for streamstats

1dbb253

Signed-off-by: Kai Huang <[email protected]>

ahkcs requested review from LantaoJin and noCharger November 14, 2025 18:19
